iT邦幫忙

2022 iThome Ironman Contest (鐵人賽)

DAY 18
Self-Challenge Group

Learning Web Scraping with Python in 30 Days, Part 18

[Day18] Selenium Review

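Scraping Hotel Information with Selenium

The target site is Hotels.com. The task is to specify the destination, the check-in/check-out dates, and the number of guests, then use Selenium to extract each hotel's name, area, price, and rating, print them, and save them to hotel.csv. Note that the script below fills in only the destination and leaves the dates and guest count at the site's defaults.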

import csv
import time
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.chrome.service import Service

s = Service("chromedriver.exe")
driver = webdriver.Chrome(service=s)
driver.implicitly_wait(10)
driver.get("https://tw.hotels.com")

# Locate the destination field by XPath
location = driver.find_element(By.XPATH, "//button[@class='uitk-fake-input uitk-form-field-trigger']")
location.click()
city = driver.find_element(By.XPATH, "//input[@id='destination_form_field']")
city.send_keys("台北")  # "Taipei"
city.send_keys(Keys.ENTER)
time.sleep(3)
# Locate the search button by XPath
search = driver.find_element(By.XPATH, "//button[@id='submit_button']")
search.click()
time.sleep(3)
# Locate the "show more" button by XPath
more = driver.find_element(By.XPATH, "//button[@class='uitk-button uitk-button-medium uitk-button-secondary']")
more.click()
time.sleep(3)
# Locate the card elements that hold each hotel's data by XPath
hotels = driver.find_elements(By.XPATH, "//div[@class='uitk-card uitk-card-roundcorner-all uitk-card-has-primary-theme']")

# Open the CSV file and write the extracted data
with open("hotel.csv", 'w+', newline='', encoding="utf-8") as fp:
    writer = csv.writer(fp)
    for hotel in hotels:
        # Locate the hotel name, area, price, and rating by class name
        name = hotel.find_element(By.CLASS_NAME, "uitk-heading-5")
        place = hotel.find_element(By.CLASS_NAME, "uitk-text")
        price = hotel.find_element(By.CLASS_NAME, "uitk-type-600")
        star = hotel.find_element(By.CLASS_NAME, "uitk-spacing-padding-inlineend-one")
        rowList = [name.text, place.text, price.text, star.text]
        writer.writerow(rowList)
        # Print the extracted data (labels: hotel, area, price, rating)
        print("飯店: " + name.text)
        print("區域: " + place.text)
        print("價錢: " + price.text)
        print("評價: " + star.text)
        print('————————————————————————————————————————————')
        
driver.quit()
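
The fixed time.sleep(3) pauses in the script always wait the full three seconds, and they still fail if the page takes longer than that to load. A common alternative is to poll for a condition until it holds or a timeout expires; Selenium provides this through WebDriverWait in selenium.webdriver.support.ui. The sketch below shows only the underlying idea in plain Python. The names wait_until, timeout, and interval are hypothetical helpers chosen for this illustration and are not part of Selenium.

```python
import time


def wait_until(condition, timeout=10.0, interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds
    have passed without a truthy result.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(interval)


# Hypothetical usage with the script above (needs the live `driver`;
# CARD_XPATH stands for the hotel-card XPath used in the script):
#   hotels = wait_until(lambda: driver.find_elements(By.XPATH, CARD_XPATH))
```

Because find_elements returns an empty list when nothing matches, the lambda stays falsy until at least one hotel card has been rendered, so the wait ends as soon as the data is available rather than after a fixed delay.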


Output:
https://ithelp.ithome.com.tw/upload/images/20221002/20152180nf1kJ4RNpS.png

Opening the CSV file in Excel:
For detailed steps, see https://www.managertoday.com.tw/articles/view/55615
https://ithelp.ithome.com.tw/upload/images/20221002/20152180ST6VC5n2PB.png
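
Two small adjustments can make the CSV file easier to work with in Excel: writing a header row, and using the "utf-8-sig" encoding, which puts a byte order mark at the start of the file so that Excel can detect UTF-8 and display Chinese text correctly. The following is a minimal sketch under those assumptions; the function names and the column names in HEADER are hypothetical and do not appear in the original script.

```python
import csv

HEADER = ["hotel", "area", "price", "rating"]  # hypothetical column names


def write_hotels_csv(path, rows):
    """Write a header row plus data rows, with a BOM so Excel detects UTF-8."""
    with open(path, "w", newline="", encoding="utf-8-sig") as fp:
        writer = csv.writer(fp)
        writer.writerow(HEADER)
        writer.writerows(rows)


def read_hotels_csv(path):
    """Read the file back as a list of dicts keyed by the header row."""
    with open(path, newline="", encoding="utf-8-sig") as fp:
        return list(csv.DictReader(fp))
```

In the scraping loop, each rowList could be collected into a list and passed to write_hotels_csv("hotel.csv", rows) once the loop finishes, instead of writing row by row.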

